Abstract

From three quasar samples with a total of 1038 objects in the redshift range $1.0 \le z \le 2.2$
we measure the variance $\sigma^2$ of counts in cells of volume $V_u$. By a maximum likelihood
analysis applied separately to these samples we obtain estimates of $\sigma^2(\ell)$, with $\ell = V_u^{1/3}$.
The analysis of a single catalog for $\ell = 40\,h^{-1}$ Mpc, and of a suitable average over
the three catalogs for $\ell = 60$, 80 and 100 $h^{-1}$ Mpc, gives $\sigma^2(\ell) = 0.46$, 0.18,
0.05 and 0.12 respectively, where the associated 70% confidence ranges account for both
sampling errors and statistical fluctuations in the counts. This allows a comparison of
QSO clustering on large scales with analogous data recently obtained both for optical
and IRAS galaxies: QSOs seem to be more clustered than these galaxies by a biasing
factor $b_{\rm QSO}/b_{\rm gal} \sim 1.4 - 2.3$.

Subject headings: galaxies: clustering — quasars: general, surveys — large-scale struc- 
ture of the universe 
1. Introduction

Only in recent years has the rapid growth of quasar surveys made possible the
analysis of their clustering properties. The availability of faint quasar samples, with
their high surface density and size, has allowed a detailed study at scales r < 150
Mpc (e.g. Shanks et al. 1987; Anderson, Kunth, & Sargent 1988; Iovino & Shaver 1988;
Andreani, Cristiani, & La Franca 1991). There is now substantial agreement on the
results for the quasar two-point correlation function $\xi(r)$. This function is larger than
unity at scales r < 10 Mpc, but the issue of its evolution with redshift is still a matter
of debate (Iovino, Shaver, & Cristiani 1991; Boyle et al. 1991; Andreani & Cristiani
1992).

In this work we analyze QSO clustering by means of the variance of counts in cells. 
The advantage of this method is to provide information on clustering at various scales 
(i.e. various cell sizes), even when the volume covered by the catalog does not form a 
connected region; this is particularly useful for the available QSO samples. Statistics of 
counts in cells have been recently considered by various authors (e.g. Efstathiou et al. 
1990; Saunders et al. 1991; Loveday et al. 1992; Gaztanaga 1992; Bouchet et al. 1993), 
to obtain reliable constraints on the amplitude of galaxy clustering on different scales, 
through the variance, and on its possible deviations from a Gaussian behavior, through 
higher order moments such as the skewness. It is also relatively easy,
within a model for structure formation, to obtain theoretical predictions for the moments 
of counts in cells at various scales. Moreover, this kind of analysis, combined with similar 
studies performed for optical and IRAS galaxies, allows a direct determination of the 
biasing factor relating the clustering of QSOs with that of these classes of objects. 

After shot-noise subtraction, the variance of the continuous density fluctuation
field, smoothed over the cell size $\ell$, is related to the spatial correlation function $\xi(r)$ by
the integral

$$\sigma^2(\ell) = \int_0^\infty dr\, r^2\, \xi(r)\, T_\ell(r), \qquad (1)$$

where the window function $T_\ell(r)$ takes into account the details of the cell geometry.

For spherical cells of radius $R$, one finds

$$T_R(r) = \frac{18}{\pi R^3} \int_0^\infty dx\, j_1^2(x)\, j_0(x r/R) = \frac{3}{R^3} \left(1 - \frac{r}{2R}\right)^2 \left(1 + \frac{r}{4R}\right) \vartheta_H(2R - r), \qquad (2)$$

where $j_l$ are spherical Bessel functions of order $l$ and $\vartheta_H(x)$ is the Heaviside function
(which is zero for $x < 0$ and one for $x > 0$). These relations allow one to connect the results
of this work with previous data on the quasar-quasar correlation function. Actually,
the two methods are complementary: the variance yields more compact information
on the clustering amplitude at the scale of the cell size, while the correlation function
gives more detailed geometrical information. Being a volume average of the correlation
function, the variance is characterized by a higher signal-to-noise ratio.
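
As a numerical illustration of the link between the variance and the correlation function, the sketch below integrates $r^2 \xi(r) T_R(r)$ for a spherical top-hat cell, using the closed form $T_R(r) = (3/R^3)(1 - r/2R)^2(1 + r/4R)$ for $r < 2R$, and compares the result with the standard analytic expression for a power law. The power-law form $\xi(r) = (r/r_0)^{-\gamma}$ and the values $r_0 = 6$, $\gamma = 1.8$, $R = 25$ are assumptions chosen only for illustration; they are not measurements from this work.

```python
import math

def tophat_window(r, R):
    """Spherical top-hat window T_R(r); it vanishes for r >= 2R."""
    if r >= 2.0 * R:
        return 0.0
    x = r / R
    return 3.0 / R**3 * (1.0 - x / 2.0) ** 2 * (1.0 + x / 4.0)

def sigma2_from_xi(xi, R, n=200_000):
    """Midpoint-rule integral of r^2 xi(r) T_R(r) over 0 < r < 2R."""
    h = 2.0 * R / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += r * r * xi(r) * tophat_window(r, R)
    return total * h

def sigma2_power_law(R, r0, gamma):
    """Analytic variance in a sphere for xi(r) = (r/r0)^-gamma."""
    return 72.0 * (r0 / R) ** gamma / (
        (3.0 - gamma) * (4.0 - gamma) * (6.0 - gamma) * 2.0**gamma)

# Assumed, illustrative correlation-function parameters (not from the paper).
R0_ASSUMED, GAMMA_ASSUMED, R_CELL = 6.0, 1.8, 25.0

def xi_power_law(r):
    return (r / R0_ASSUMED) ** (-GAMMA_ASSUMED)

numeric = sigma2_from_xi(xi_power_law, R_CELL)
analytic = sigma2_power_law(R_CELL, R0_ASSUMED, GAMMA_ASSUMED)
print(numeric, analytic)
```

The two numbers should agree closely, which checks both the window function and its normalization (the integral of $r^2 T_R(r)$ is unity).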

2. Data Samples and Statistical Analysis 

Table I lists our database, which consists of eight different published surveys.
For each survey it reports the sample name (column 1), the effective area covered (column
2), the limiting magnitude (column 3), the number of objects with $M_B < -23$ ^1 (column
4) within the assumed redshift range (column 5), and the number of objects between
redshift 1 and 2.2 (column 6).

The samples contain objects selected with different techniques: UV excess, variability
and slitless spectroscopy. Care has been taken to use only complete catalogs,
in order to minimize systematic biases. The optimal redshift range for our statistical
study is $1 < z < 2.2$, because the QSO number density is highest in this range
and the catalog completeness decreases beyond $z = 2.2$.

In spite of the different catalog selection criteria, the high completeness in the
considered redshift range allows us to subdivide our database into three groups (named
Samples A, B and C in the following) solely on the basis of their limiting magnitude;
each of these samples is then characterized by its own selection function. Sample
A (510 objects): APM; Sample B (332 objects): Boyle et al. (1990), the $m_J < 21$ sample
(hereafter HVI) from Hawkins & Veron (1993), Zitelli et al. (1992) and Osmer & Hewett
(1991), all cut at the limiting magnitude $m_J = 20.85$, which leads to a 2.5% decrease
in the number of objects; Sample C (122 objects): La Franca et al. (1992), the $m_J < 19.5$
sample (hereafter HVII) from Hawkins & Veron (1993) and Crampton et al. (1989),
cut at $m_J = 19.5$, with a 34% decrease in the number. The actual limiting magnitude
has been chosen slightly different for each sample, to take into account the different
galactic extinction. The B magnitudes of La Franca et al. (1992) have been converted
to J magnitudes according to the relation $m_J = m_B - 0.05$.

^1 The absolute B magnitude is calculated assuming a Hubble constant $H_0 = 100\,h$ km s$^{-1}$ Mpc$^{-1}$,
with $h = 0.5$, in a flat universe with vanishing cosmological constant; $k$-corrections are as in Cristiani
& Vio (1990) and galactic extinction as in Burstein & Heiles (1982).

To compute the moments of QSO counts in cells we first divide our three samples
into shells with mean radii $r_\alpha$ centered on the observer, each further divided into $M_\alpha$ cells of
volume $V_u$. Let $N_j$ be the number of objects in the $j$-th cell ($j = 1, \ldots, M_\alpha$) of a given
shell and $V_j \le V_u$ the cell volume actually included within the sample boundary, estimated
by means of a standard Monte Carlo technique. Cells with $V_j < 0.5\,V_u$ have not been
used.
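
The Monte Carlo volume estimate can be sketched as follows: random points are thrown uniformly into a cell and the fraction falling inside the survey boundary gives $V_j/V_u$. The box-shaped cell, the `inside_survey` predicate and the toy boundary are hypothetical stand-ins; the actual survey geometry is not specified here.

```python
import random

def included_volume(cell_min, cell_max, inside_survey, n_points=20_000, seed=0):
    """Monte Carlo estimate of the part of a box-shaped cell inside the survey.

    cell_min, cell_max: opposite corners (x, y, z) of the cell.
    inside_survey: hypothetical predicate, True for points within the boundary.
    """
    rng = random.Random(seed)
    sides = [hi - lo for lo, hi in zip(cell_min, cell_max)]
    full_volume = sides[0] * sides[1] * sides[2]
    hits = 0
    for _ in range(n_points):
        point = tuple(lo + rng.random() * s for lo, s in zip(cell_min, sides))
        if inside_survey(point):
            hits += 1
    return full_volume * hits / n_points

def toy_boundary(point):
    """Assumed boundary for illustration only: the survey covers x < 30."""
    return point[0] < 30.0

V_u = 60.0 ** 3
V_j = included_volume((0.0, 0.0, 0.0), (60.0, 60.0, 60.0), toy_boundary)
use_cell = V_j >= 0.5 * V_u   # cells with V_j < 0.5 V_u are discarded
print(V_j / V_u, use_cell)
```

With this toy boundary the cell is cut in half, so the estimated fraction scatters around 0.5 and the cell sits right at the rejection threshold.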

In calculating the variance of counts in cells we had to account for the volume
incompleteness of our samples. Following Efstathiou et al. (1990) we write

$$\Sigma^2_\alpha = \frac{\sum_j (N_j - \bar N_\alpha V_j/V_u)^2 - \left[1 - \sum_j V_j^2/(\sum_j V_j)^2\right] \sum_j N_j}{(\bar N_\alpha/V_u)^2 \left[\sum_j V_j^2 - 2 \sum_j V_j^3/\sum_j V_j + (\sum_j V_j^2)^2/(\sum_j V_j)^2\right]}, \qquad (3)$$

where $\Sigma^2_\alpha$ estimates $\sigma^2_\alpha = \langle (\Delta N/\bar N_\alpha)^2 \rangle$ and $\bar N_\alpha = V_u \sum_j N_j/\sum_j V_j$ is the expected number of objects
in a cell of volume $V_u$ belonging to the $\alpha$-th shell. The shot-noise subtraction in Eq.(3)
may result in negative values for the estimates of $\sigma^2_\alpha$ (see, e.g., Figure 1 in Efstathiou et
al. 1990): this is because the Poisson model only approximately describes discreteness
effects. In this sense one can safely state that Eq.(3) represents an estimate of the
excess variance above the Poisson level; this also assumes that the expected variance is
independent of the cell volume, which is the case if the missing volume in incomplete
cells is small, given that $\sigma^2$ is likely to be a weak function of cell size.
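
A minimal sketch of a shot-noise-subtracted estimator for incomplete cells follows. It assumes that each count $N_j$ has mean $n V_j$ and variance $n V_j + n^2 V_j^2 \sigma^2$, with $\sigma^2$ independent of the cell volume; the numerator coded here follows from that assumption and should be checked against Efstathiou et al. (1990) before use. The function name and the toy counts are illustrative.

```python
def excess_variance(counts, volumes, V_u):
    """Variance of counts in excess of Poisson noise, for one shell.

    counts: objects N_j per cell; volumes: included volumes V_j (<= V_u).
    Assumes mean n*V_j and variance n*V_j + (n*V_j)^2 * sigma^2 per cell.
    """
    n_sum = sum(counts)
    v1 = sum(volumes)
    v2 = sum(v**2 for v in volumes)
    v3 = sum(v**3 for v in volumes)
    n_bar = V_u * n_sum / v1                      # expected count in a full cell
    scatter = sum((n - n_bar * v / V_u) ** 2 for n, v in zip(counts, volumes))
    shot_noise = (1.0 - v2 / v1**2) * n_sum       # Poisson contribution
    norm = (n_bar / V_u) ** 2 * (v2 - 2.0 * v3 / v1 + v2**2 / v1**2)
    return (scatter - shot_noise) / norm

# Toy counts in five complete cells (illustrative values only).
toy = excess_variance([3, 5, 2, 6, 4], [1.0] * 5, 1.0)
print(toy)
```

For complete cells the expression reduces to $(S - \bar N)/\bar N^2$, with $S$ the unbiased sample variance of the counts; the toy counts are sub-Poissonian, so the estimate comes out negative, as the text notes can happen.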

When using different catalogs grouped within a sample (as in Samples B and C), 
even if the selection method is the same, the effects of systematic errors (e.g. in the 
zero point of the magnitude calibration) have to be considered. We have therefore 
normalized the different catalogs of Samples B and C by selecting as a reference catalog 
the one with the highest surface density in the sample and reducing the effective cell 
volumes of the remaining ones by the ratio of their surface density (derived from Table 
I) to that of the reference catalog. In this way, we expect to have removed the above 
mentioned systematic effects, leaving a bias, if any, in the sense of underestimating the 
variance. 
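
The normalization step can be sketched as below, taking the Table I counts in $1 < z < 2.2$ and the listed areas at face value for the Sample B catalogs. Using the uncut counts and the full listed areas is a simplification made here for illustration; the scaling factors actually adopted in the analysis are not quoted in the text.

```python
# (objects in 1 < z < 2.2, area in sq. deg.) as listed in Table I.
# Taking these uncut numbers at face value is an assumption of this sketch.
sample_b = {
    "Boyle et al. (1990)": (236, 10.15),
    "HVI": (24, 2.0),
    "Zitelli et al. (1992)": (12, 0.69),
    "Osmer & Hewett (1991)": (66, 6.1),
}

def volume_scalings(catalogs):
    """Reference catalog (highest surface density) and per-catalog factors."""
    densities = {name: n / area for name, (n, area) in catalogs.items()}
    reference = max(densities, key=densities.get)
    factors = {name: d / densities[reference] for name, d in densities.items()}
    return reference, factors

reference, factors = volume_scalings(sample_b)
for name, factor in factors.items():
    print(f"{name}: effective cell volume scaled by {factor:.2f}")
```

Every factor is at most one by construction, so the effective cell volumes of the non-reference catalogs can only shrink.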

Errors in the estimate of the variance, Var($\Sigma^2$), are computed as the quadratic
sum of two terms: the first, Var$_{\rm samp}$($\Sigma^2$), accounts for the sampling errors inherent
in our data, and the second, Var$_{\rm stat}$($\Sigma^2$), corresponds to the statistical uncertainty.
In order to quantify the sampling errors in our data we used a bootstrap resampling
technique (e.g. Barrow, Bhavsar, & Sonoda 1984) in each separate sub-sample, accounting
for the different densities. The second contribution to the variance of $\Sigma^2_\alpha$ can
be estimated under the simplifying assumption that the cells are independent; making
use of the Central Limit Theorem, one can then approximate the underlying distribution
by a Gaussian with variance $\sigma^2 = \Sigma^2_\alpha$ (see Efstathiou et al. 1990). This results in

Var$_{\rm stat}$($\Sigma^2_\alpha$) = [expression in $\Sigma^2_\alpha$, $\bar N_\alpha$ and $M_\alpha$; not legible in this copy]. (4)

This method of calculating Var($\Sigma^2$) allows us to deal both with catalogs having a
reduced number of cells (such as Sample C) and with catalogs affected by dilution (Sample A): in the
former case the larger contribution comes from the theoretical variance, in the latter
from the bootstrap error. Note, however, that this method leads to a more conservative
estimate of the error bars, which are typically larger than in previous analyses of the
variance of counts in cells.
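
The sampling term can be illustrated with a simple bootstrap over cells, sketched below for complete cells of equal volume. Resampling cells (rather than objects) is an assumption of this sketch, since the text does not say what is resampled, and the toy counts are invented for illustration.

```python
import random

def sigma2_complete_cells(counts):
    """Shot-noise-subtracted variance for cells of equal, complete volume."""
    m = len(counts)
    mean = sum(counts) / m
    s2 = sum((n - mean) ** 2 for n in counts) / (m - 1)
    return (s2 - mean) / mean**2

def bootstrap_variance(counts, n_resamples=1000, seed=1):
    """Variance of the estimator over resamplings of the cells with replacement."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_resamples):
        resample = [rng.choice(counts) for _ in counts]
        if sum(resample) == 0:
            continue  # estimator undefined for an empty resample
        values.append(sigma2_complete_cells(resample))
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

def total_variance(var_samp, var_stat):
    """Errors added in quadrature, i.e. the two variances are summed."""
    return var_samp + var_stat

toy_counts = [4, 7, 2, 9, 5, 3, 8, 6, 1, 10]   # illustrative values only
var_samp = bootstrap_variance(toy_counts)
print(var_samp)
```

The statistical term would be supplied separately and combined with `total_variance`.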

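The final variance, $\sigma^2$, for the cell counts of QSOs at a given scale, separately for
Samples A, B and C, is obtained by maximizing the likelihood function

$$\mathcal{L}(\sigma^2) = \prod_\alpha \frac{1}{[2\pi\,{\rm Var}(\Sigma^2_\alpha)]^{1/2}} \exp\left[-\frac{(\Sigma^2_\alpha - \sigma^2)^2}{2\,{\rm Var}(\Sigma^2_\alpha)}\right], \qquad (5)$$

where the product extends over all shells.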
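
For a product of Gaussians over shells, and if the per-shell error variances are held fixed (an assumption of this sketch; they may in practice depend on $\sigma^2$), the likelihood maximum is simply the inverse-variance weighted mean of the shell estimates. The shell values below are invented for illustration.

```python
import math

def log_likelihood(sigma2, estimates, variances):
    """Log of a product of Gaussians, one factor per shell."""
    return sum(-0.5 * math.log(2.0 * math.pi * v) - (s - sigma2) ** 2 / (2.0 * v)
               for s, v in zip(estimates, variances))

def ml_sigma2(estimates, variances):
    """Maximum of the likelihood for fixed variances: weighted mean."""
    weights = [1.0 / v for v in variances]
    return sum(w * s for w, s in zip(weights, estimates)) / sum(weights)

# Toy shell estimates and their error variances (illustrative values only).
shell_estimates = [0.2, 0.4]
shell_variances = [0.01, 0.04]
best = ml_sigma2(shell_estimates, shell_variances)
print(best)
```

Shells with small error variance dominate the result, which is why sparsely populated shells contribute little to the final estimate.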
3. Results and Discussion

We report the results for the variance of counts in cells of sizes $\ell = V_u^{1/3}$ = 60, 80
and 100 $h^{-1}$ Mpc. For Sample B, which has the highest density, we can also consider
cells of size $\ell = 40\,h^{-1}$ Mpc. All these cells have a parallelepiped
geometry, with the line-of-sight dimension larger than the transverse ones by a factor of
1.55, in order to better follow the geometry of the catalogs.
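
As a quick worked example of the cell geometry, the sketch below computes the side lengths of a parallelepiped of volume $\ell^3$ whose line-of-sight side is 1.55 times the transverse ones. That the two transverse sides are equal is an assumption made here; the function name is illustrative.

```python
def cell_dimensions(ell, elongation=1.55):
    """Transverse and line-of-sight sides of a cell of volume ell**3.

    Assumes two equal transverse sides and a line-of-sight side that is
    `elongation` times longer.
    """
    transverse = ell / elongation ** (1.0 / 3.0)
    return transverse, elongation * transverse

for ell in (40.0, 60.0, 80.0, 100.0):
    transverse, line_of_sight = cell_dimensions(ell)
    print(f"ell = {ell:5.1f}: {transverse:6.2f} x {transverse:6.2f} x {line_of_sight:6.2f}")
```

For $\ell = 60$ this gives transverse sides of about 52 and a line-of-sight side of about 80, in the same units as $\ell$.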

Figures 1, 2 and 3 show, for the cell sizes considered, the variance of QSO counts
in cells for Samples A, B and C respectively, obtained from Eq.(3), with error bars derived
from Var($\Sigma^2$). The maximum likelihood estimates of the variance as a function of
cell size for the three samples separately are reported in Table II; the 70% errors are
obtained by finding the values of the variance at which the likelihood in Eq.(5) drops
by a factor of 1.71 from its maximum. When the lower bound becomes negative we
replace it with zero. Table II also shows the $\chi^2$ values and the number of
radial shells $N_s$ for each determination.
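
The correspondence between a likelihood drop by a factor of 1.71 and a 70% confidence range can be checked for a Gaussian likelihood, as in the sketch below, which also clips a negative lower bound to zero. Treating the error variances as fixed is an assumption of this sketch, and the input numbers are illustrative.

```python
import math

def coverage_for_drop(drop):
    """Gaussian probability enclosed where the likelihood exceeds max/drop."""
    n_sigma = math.sqrt(2.0 * math.log(drop))
    return math.erf(n_sigma / math.sqrt(2.0))

def confidence_range(best, variances, drop=1.71):
    """Range where a Gaussian likelihood with fixed variances exceeds max/drop."""
    sigma_tot = math.sqrt(1.0 / sum(1.0 / v for v in variances))
    half_width = math.sqrt(2.0 * math.log(drop)) * sigma_tot
    return max(0.0, best - half_width), best + half_width

print(coverage_for_drop(1.71))
print(confidence_range(0.05, [0.01]))   # toy input: lower bound clipped to zero
```

A drop by 1.71 corresponds to about 1.04 standard deviations on either side of the maximum.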

Having consistently computed three independent estimates of $\sigma^2$ at various scales,
we can now use the maximum likelihood method [Eq.(5)] to estimate the overall
variance, considering all samples together, at 60, 80 and 100 $h^{-1}$ Mpc: we find
$\sigma^2 = 0.18$, 0.05 and 0.12, respectively (central values, quoted with 70% confidence ranges). We can
also compare these data with the estimate of the variance resulting from the QSO correlation
function, $\xi(r)$, obtained from Samples A, B and C separately, according to the
methods described in Andreani, Cristiani, & La Franca (1991). We fit a spline to $\xi(r)$
and numerically integrate Eq.(1) and Eq.(2), with a spherical top-hat filter of equivalent
volume. The results and the errors, obtained by bootstrap resampling, are shown in
Figure 4 together with the maximum likelihood estimates of $\sigma^2$ from the counts. Within
the (70%) error bars, the two methods provide compatible results, although the values
of $\sigma^2$ derived from the counts in Sample B are systematically higher.

In order to evaluate the possible redshift dependence of the clustering amplitude
we calculated the variance for the two separate redshift ranges $1 < z < 1.6$ and
$1.6 < z < 2.2$. We found that the results are compatible with a constant comoving
clustering amplitude within the error bars (e.g. Andreani & Cristiani 1992), although a
slight tendency toward larger variance in the nearer strip occurs in Samples B and C.

Given the size of the error bars, which in some cases make the results compatible
with no clustering, we decided to check for the presence of real clustering in our data by
performing a Kolmogorov-Smirnov test against the null hypothesis that our counts are
drawn from a Poisson parent distribution. To this aim we generated 10,000 mock Poisson
catalogs with the same density, selection function and volume coverage, separately
for Samples A, B and C. We then compared the resulting histograms of the cell counts
with the real ones. They are shown in Figure 5, where the dotted lines correspond to
histograms of the counts in cells for the Poissonian case and the solid lines to those of the
real data for the three samples and different cell sizes. The Kolmogorov-Smirnov test
indicates that the Poisson hypothesis can be rejected at a high confidence level (most strongly
at 40 $h^{-1}$ Mpc, less strongly at 100 $h^{-1}$ Mpc), with the only exception of Sample
A, for which the Poissonian hypothesis cannot be rejected; in this case we indeed found
that the bootstrap errors dominate the overall variance of $\Sigma^2$.
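
The comparison with Poisson mocks can be sketched as follows: mock cell counts are drawn from a Poisson distribution with the observed mean and the two-sample Kolmogorov-Smirnov statistic is computed between observed and mock counts. The observed counts below are invented, and the sketch pools 10,000 mock cells rather than building full mock catalogs with a selection function; it computes only the statistic, not a significance level.

```python
import math
import random

def poisson_sample(mean, rng):
    """Knuth's multiplication method; adequate for small means."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    d_max = 0.0
    for v in sorted(set(sample_a) | set(sample_b)):
        cdf_a = sum(1 for x in sample_a if x <= v) / len(sample_a)
        cdf_b = sum(1 for x in sample_b if x <= v) / len(sample_b)
        d_max = max(d_max, abs(cdf_a - cdf_b))
    return d_max

rng = random.Random(42)
observed = [0, 0, 1, 0, 7, 0, 2, 9, 0, 1, 0, 6]   # toy, clumpy counts (assumed)
mean_count = sum(observed) / len(observed)
mock = [poisson_sample(mean_count, rng) for _ in range(10_000)]
d_obs = ks_statistic(observed, mock)
print(d_obs)
```

Because cell counts are discrete, the usual asymptotic KS significance levels are only approximate; calibrating the statistic against the mocks themselves is one way around this.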

Our results can be compared with those for the variance of IRAS galaxies in the
QDOT sample, analyzed by Efstathiou et al. (1990); they find central values $\sigma^2 = 0.21$ and
0.05 for cubic cells of size 40 and 60 $h^{-1}$ Mpc respectively. Loveday et al. (1992),
who performed a similar analysis of the Stromlo-APM redshift survey of optical galaxies,
obtain $\sigma^2 = 0.14$, 0.05 and 0.02 for cubic cells of size 40, 60 and 75 $h^{-1}$
Mpc, respectively. A recent estimate is given by Bouchet et al. (1993) for the 1.2 Jy
IRAS Galaxy Redshift Survey; they obtain the best fit
$\log \sigma^2(R) = (1.17 \pm 0.05) - (1.59 \pm 0.06) \log R$
for spherical cells of radius $R$. This corresponds to $\sigma^2 \approx 0.09$ and 0.05 for
$\ell = 40$ and 60 $h^{-1}$ Mpc respectively (having accounted for the different geometry of the
cells). Note that the (95%) confidence ranges quoted by Efstathiou et al. (1990) and
Loveday et al. (1992) are obtained by considering only the theoretical part of the error,
i.e. neglecting sampling fluctuations, whilst we made the more conservative choice of
summing the two uncertainties.
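
The conversion of the Bouchet et al. (1993) fit to cubic cell sizes can be reproduced with a short calculation, sketched below. It assumes that "accounting for the different geometry" means matching a cube of side $\ell$ to a sphere of equal volume, and it uses only the central values of the fit; both choices are assumptions of this sketch.

```python
import math

def sigma2_bouchet(R):
    """Central values of the quoted fit: log sigma^2 = 1.17 - 1.59 log R."""
    return 10.0 ** (1.17 - 1.59 * math.log10(R))

def equal_volume_radius(ell):
    """Radius of the sphere with the same volume as a cube of side ell."""
    return ell * (3.0 / (4.0 * math.pi)) ** (1.0 / 3.0)

values = {ell: sigma2_bouchet(equal_volume_radius(ell)) for ell in (40.0, 60.0)}
for ell, s2 in values.items():
    print(f"ell = {ell:.0f}: R = {equal_volume_radius(ell):.1f}, sigma^2 = {s2:.3f}")
```

Under these assumptions the fit gives values that round to the 0.09 and 0.05 quoted above.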

Our data are compatible, within the errors, with all the results above. Nevertheless,
it could be argued that QSOs are biased with respect to both IRAS and optical galaxies; we
find $b_{\rm QSO}/b_{\rm gal} = \sigma_{\rm QSO}/\sigma_{\rm gal}$ in the range 1.4-2.3. This effect is indeed predicted
by hierarchical theories of quasar formation within massive haloes (Efstathiou & Rees
1988; Cole & Kaiser 1989; Haehnelt & Rees 1993), although the amplitude of such a
bias strongly depends on the specific model of structure formation. This issue clearly
deserves more realistic modeling of quasar origin, also taking into account the recent
observational constraints from large-scale structures, such as the normalization implied
by COBE data (Smoot et al. 1992). On the other hand, our statistical analysis shows
that more stringent constraints on quasar clustering will only be obtained when new
catalogs are constructed with homogeneous selection criteria and over wider and
deeper regions of the sky: a goal which can be reached within a few years, thanks to the
availability of multiobject spectrographs.
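
The quoted range of the relative bias can be illustrated from the central values given in this section, as in the sketch below. Using central values only, and only the 40 and 60 $h^{-1}$ Mpc scales common to the surveys, is an assumption of this sketch; the text does not state exactly how the range was derived.

```python
import math

# Central values of sigma^2 at cell sizes 40 and 60 h^-1 Mpc quoted in the text.
qso = {40: 0.46, 60: 0.18}
galaxies = {
    "IRAS QDOT (Efstathiou et al. 1990)": {40: 0.21, 60: 0.05},
    "Stromlo-APM (Loveday et al. 1992)": {40: 0.14, 60: 0.05},
    "IRAS 1.2 Jy (Bouchet et al. 1993)": {40: 0.09, 60: 0.05},
}

def bias_ratios(qso_sigma2, galaxy_sigma2):
    """b_QSO / b_gal = sigma_QSO / sigma_gal at each common cell size."""
    return {ell: math.sqrt(qso_sigma2[ell] / s2) for ell, s2 in galaxy_sigma2.items()}

all_ratios = []
for name, sigma2 in galaxies.items():
    ratios = bias_ratios(qso, sigma2)
    all_ratios.extend(ratios.values())
    print(name, {ell: round(r, 2) for ell, r in ratios.items()})
print(min(all_ratios), max(all_ratios))
```

The ratios of central values fall between roughly 1.5 and 2.3, consistent with the range stated above.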
Table I
Quasar surveys

survey                    surface (a)   limiting       # objects (b)   z range     # objects in
                          (sq. deg.)    magnitude                                  1 < z < 2.2

APM (c)                   516           m_J < 18.5     1006            0.2-3.1     510
Boyle et al. (1990)       10.15 (d)     m_J < 20.9     320             0.2-2.2     236
Crampton et al. (1989)    4.8           m_J < 20.5     135             0.2-3.1     87
HVI (e)                   2             m_J < 21.0     29              0.3-2.2     24
HVII (e)                  19            m_J < 19.5     66              0.3-2.2     40
La Franca et al. (1992)   10            m_B < 19.9     95              0.35-2.2    63
Osmer & Hewett (1991)     6.1           m_J < 21.7     113             0.2-3.1     66
Zitelli et al. (1992)     0.69          m_J < 20.85    21              0.6-2.8     12

(a) claimed effective area
(b) number of objects with $M_B < -23$ in an $h = 0.5$, $\Omega_0 = 1$ universe
(c) Foltz et al. (1987); Foltz et al. (1989); Hewett et al. (1991); Chaffee et al. (1991);
    Morris et al. (1991)
(d) only 5 out of 8 fields have been used in this work
(e) Hawkins & Veron (1993)
Table II



Variance from counts in cells

Sample   $\ell$ ($h^{-1}$ Mpc)   $\sigma^2$   $\sigma^2$ (70%)   $\chi^2/N_s$

A        60                      0.22         0.00 - 0.62        0.88/11
         80                      0.00         0.00 - 0.32        0.32/8
         100                     0.00         0.00 - 0.24        0.64/6
B        40                      0.46         0.19 - 0.73        6.90/16
         60                      0.27         0.07 - 0.47        2.72/11
         80                      0.13         0.00 - 0.31        2.10/8
         100                     0.20         0.00 - 0.40        2.09/6
C        60                      0.00         0.00 - 0.25        4.39/11
         80                      0.00         0.00 - 0.22        1.09/8
         100                     0.10         0.00 - 0.37        0.38/6
Figure captions

Figure 1. The variance $\sigma^2$ in cells of size $\ell = 60$, 80 and 100 $h^{-1}$ Mpc as a function
of redshift, for Sample A. The error bars are one standard deviation, derived from Var($\Sigma^2$). The
solid line represents the maximum likelihood estimate of the variance; the dotted lines
correspond to the 70% confidence range.

Figure 2. The variance $\sigma^2$ in cells of size $\ell = 40$, 60, 80 and 100 $h^{-1}$ Mpc as a function
of redshift, for Sample B. Error bars and lines as in Figure 1.

Figure 3. The variance $\sigma^2$ in cells of size $\ell = 60$, 80 and 100 $h^{-1}$ Mpc as a function of
redshift, for Sample C. Error bars and lines as in Figure 1.

Figure 4. Comparison of two estimates of the QSO variance for the three samples:
filled squares refer to the estimate obtained from the counts in cells, open squares to
the integral of the correlation function. Error bars give the 70% confidence range. For
clarity, the two different estimates are shown with a small horizontal shift.

Figure 5. Histograms of the counts in cells for the three separate samples A, B and
C and for different cell sizes (from 40 to 100 $h^{-1}$ Mpc). The dotted lines correspond
to the Poissonian case, the solid ones to the real data. The hypothesis that the
distribution of objects in cells is compatible with a Poissonian one can be rejected at a
very high confidence level for Samples B and C, but not for Sample A.



